Delving into the Reversal Curse: How Far Can Large Language Models Generalize?

Lin, Zhengkai, Fu, Zhihang, Liu, Kai, Xie, Liang, Lin, Binbin, Wang, Wenxiao, Cai, Deng, Wu, Yue, Ye, Jieping

arXiv.org Artificial Intelligence

While large language models (LLMs) showcase unprecedented capabilities, they also exhibit certain inherent limitations when facing seemingly trivial tasks. A prime example is the recently debated "reversal curse", which surfaces when models, having been trained on the fact "A is B", struggle to generalize this knowledge to infer that "B is A". In this paper, we examine the manifestation of the reversal curse across various tasks and delve into both the generalization abilities and the problem-solving mechanisms of LLMs. This investigation leads to a series of significant insights: (1) LLMs are able to generalize to "B is A" when both A and B are presented in the context as in the case of a multiple-choice question. (2) This generalization ability is highly correlated to the structure of the fact "A is B" in the training documents. For example, this generalization only applies to biographies structured in "[Name] is [Description]" but not to "[Description] is [Name]". (3) We propose and verify the hypothesis that LLMs possess an inherent bias in fact recalling during knowledge application, which explains and underscores the importance of the document structure to successful learning. (4) The negative impact of this bias on the downstream performance of LLMs can hardly be mitigated through training alone. These findings offer a novel perspective on interpreting LLMs' generalization through their intrinsic mechanisms and provide insights for developing more effective learning methods. Our code and data are available at https://github.com/alibaba/thinking_bias.git.


The End of Scantron Tests

The Atlantic - Technology

Through funding cuts and bumps, integration and resegregation, panics and reforms, world wars and culture wars, American students have consistently learned at least one thing well: how to whip out a No. 2 pencil and mark exam answers on a sheet printed with row after row of bubbles. Whether you are an iPad baby or a Baby Boomer, odds are that you have filled in at least a few, if not a few hundred, of these machine-graded multiple-choice forms. They have long been the key ingredient in an alphabet soup of standardized tests, both national (SAT, ACT, TOEFL, LSAT, GRE) and local (SHSAT, STAAR, WVGSA). And they are used in both $50,000-a-year academies and the most impoverished public schools, where the classic green or blue Scantron answer sheets can accompany daily quizzes in every subject. Machine grading, now synonymous with the brand Scantron the way tissues are with Kleenex, is so popular because it can provide rapid and straightforward results for millions of students.


The End Of Multiple Choice? The Quest To Create Accurate Robot Essay Graders

AITopics Original Links

What's the best way to prove you "know" something? A. Multiple-choice tests B. Essays C. Interviews D. None of the above Go ahead: argue with the premise of the question. Oh yeah, you can't do that on multiple-choice tests. Essays can often better gauge what you know, and writing is integral to many jobs. But even though essays are widely acknowledged to be a more useful metric, we don't ask students to write much on standardized tests, because it's daunting to even imagine grading millions of essays.